Some models are useful, but how do we know which ones? Towards a unified Bayesian model taxonomy
Probabilistic (Bayesian) modeling has experienced a surge of applications in
almost all quantitative sciences and industrial areas. This development is
driven by a combination of several factors, including better probabilistic
estimation algorithms, flexible software, increased computing power, and a
growing awareness of the benefits of probabilistic learning. However, a
principled Bayesian model building workflow is far from complete and many
challenges remain. To aid future research and applications of a principled
Bayesian workflow, we ask and provide answers for what we perceive as two
fundamental questions of Bayesian modeling, namely (a) "What actually is a
Bayesian model?" and (b) "What makes a good Bayesian model?". As an answer to
the first question, we propose the PAD model taxonomy that defines four basic
kinds of Bayesian models, each representing some combination of the assumed
joint distribution of all (known or unknown) variables (P), a posterior
approximator (A), and training data (D). As an answer to the second question,
we propose ten utility dimensions according to which we can evaluate Bayesian
models holistically, namely, (1) causal consistency, (2) parameter
recoverability, (3) predictive performance, (4) fairness, (5) structural
faithfulness, (6) parsimony, (7) interpretability, (8) convergence, (9)
estimation speed, and (10) robustness. Further, we propose two example utility
decision trees that describe hierarchies and trade-offs between utilities
depending on the inferential goals that drive model building and testing.
A Deep Learning Method for Comparing Bayesian Hierarchical Models
Bayesian model comparison (BMC) offers a principled approach for assessing
the relative merits of competing computational models and propagating
uncertainty into model selection decisions. However, BMC is often intractable
for the popular class of hierarchical models due to their high-dimensional
nested parameter structure. To address this intractability, we propose a deep
learning method for performing BMC on any set of hierarchical models which can
be instantiated as probabilistic programs. Since our method enables amortized
inference, it allows efficient re-estimation of posterior model probabilities
and fast performance validation prior to any real-data application. In a series
of extensive validation studies, we benchmark the performance of our method
against the state-of-the-art bridge sampling method and demonstrate excellent
amortized inference across all BMC settings. We then showcase our method by
comparing four hierarchical evidence accumulation models that have previously
been deemed intractable for BMC due to partly implicit likelihoods. In this
application, we corroborate evidence for the recently proposed Lévy flight
model of decision-making and show how transfer learning can be leveraged to
enhance training efficiency. We provide reproducible code for all analyses and
an open-source implementation of our method.
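
As a point of orientation, the posterior model probabilities that BMC targets follow from Bayes' rule applied to the marginal likelihoods of the candidate models. The sketch below shows only that generic computation; it is not the deep learning method described in the abstract, and the function name and the four log marginal likelihood values are hypothetical.

```python
import numpy as np

def posterior_model_probs(log_marginal_liks, prior_probs=None):
    """Convert log marginal likelihoods log p(y | M_j) into p(M_j | y)."""
    log_ml = np.asarray(log_marginal_liks, dtype=float)
    if prior_probs is None:
        # Assumption: a uniform prior over the candidate models.
        prior_probs = np.full(log_ml.shape, 1.0 / log_ml.size)
    log_post = log_ml + np.log(np.asarray(prior_probs, dtype=float))
    log_post -= log_post.max()  # subtract the maximum for numerical stability
    weights = np.exp(log_post)
    return weights / weights.sum()

# Made-up log marginal likelihoods for four candidate models.
probs = posterior_model_probs([-1210.4, -1205.1, -1207.9, -1215.0])
print(probs)
```

For hierarchical models the marginal likelihoods themselves are the hard part, which is the intractability the abstract refers to.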
Neural Superstatistics for Bayesian Estimation of Dynamic Cognitive Models
Mathematical models of cognition are often memoryless and ignore potential
fluctuations of their parameters. However, human cognition is inherently
dynamic. Thus, we propose to augment mechanistic cognitive models with a
temporal dimension and estimate the resulting dynamics from a superstatistics
perspective. Such a model entails a hierarchy between a low-level observation
model and a high-level transition model. The observation model describes the
local behavior of a system, and the transition model specifies how the
parameters of the observation model evolve over time. To overcome the
estimation challenges resulting from the complexity of superstatistical models,
we develop and validate a simulation-based deep learning method for Bayesian
inference, which can recover both time-varying and time-invariant parameters.
We first benchmark our method against two existing frameworks capable of
estimating time-varying parameters. We then apply our method to fit a dynamic
version of the diffusion decision model to long time series of human response
time data. Our results show that the deep learning approach is very efficient
in capturing the temporal dynamics of the model. Furthermore, we show that the
erroneous assumption of static or homogeneous parameters will hide important
temporal information.
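
The two-level structure described above can be sketched as a small simulator. This is a toy illustration under stated assumptions, not the authors' model or code: the transition model is assumed to be a Gaussian random walk on the drift rate only, the observation model is a basic diffusion decision process simulated with an Euler-Maruyama scheme, and all names and default values are hypothetical.

```python
import numpy as np

def simulate_dynamic_ddm(n_trials=200, drift0=1.0, drift_sd=0.05,
                         boundary=1.5, ndt=0.3, dt=0.001, max_t=5.0, seed=0):
    """Simulate response times from a diffusion model with a drifting drift rate."""
    rng = np.random.default_rng(seed)
    drifts = np.empty(n_trials)
    rts = np.empty(n_trials)
    choices = np.empty(n_trials, dtype=int)
    drift = drift0
    for trial in range(n_trials):
        # High-level transition model: the drift rate takes a random-walk step.
        drift = drift + drift_sd * rng.normal()
        # Low-level observation model: evidence accumulates between the
        # boundaries 0 and `boundary`, starting midway between them.
        evidence, elapsed = boundary / 2.0, 0.0
        while 0.0 < evidence < boundary and elapsed < max_t:
            evidence += drift * dt + np.sqrt(dt) * rng.normal()
            elapsed += dt
        drifts[trial] = drift
        rts[trial] = ndt + elapsed  # add non-decision time
        # 1 = upper boundary; trials cut off at max_t are coded 0 for simplicity.
        choices[trial] = int(evidence >= boundary)
    return drifts, rts, choices

drifts, rts, choices = simulate_dynamic_ddm(n_trials=50)
```

A static model would hold `drift` fixed across trials; the simulated `drifts` trajectory is the kind of time-varying parameter that an estimation method in this setting has to recover.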
JANA: Jointly Amortized Neural Approximation of Complex Bayesian Models
This work proposes "jointly amortized neural approximation" (JANA) of
intractable likelihood functions and posterior densities arising in Bayesian
surrogate modeling and simulation-based inference. We train three complementary
networks in an end-to-end fashion: 1) a summary network to compress individual
data points, sets, or time series into informative embedding vectors; 2) a
posterior network to learn an amortized approximate posterior; and 3) a
likelihood network to learn an amortized approximate likelihood. Their
interaction opens a new route to amortized marginal likelihood and posterior
predictive estimation, two important ingredients of Bayesian workflows that
are often too expensive for standard methods. We benchmark the fidelity of JANA
on a variety of simulation models against state-of-the-art Bayesian methods and
propose a powerful and interpretable diagnostic for joint calibration. In
addition, we investigate the ability of recurrent likelihood networks to
emulate complex time series models without resorting to hand-crafted summary
statistics.
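
One way to see why a likelihood approximator and a posterior approximator together give access to the marginal likelihood is Bayes' rule rearranged: log p(x) = log p(x | θ) + log p(θ) − log p(θ | x) for any θ. The sketch below checks this identity on a conjugate normal model where every density is known exactly. It is an illustration only: in the setting of the abstract, learned networks would stand in for the exact likelihood and posterior, and the function names and default values here are hypothetical.

```python
import math

def normal_logpdf(value, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2.0 * math.pi * sd**2) - 0.5 * ((value - mean) / sd) ** 2

def log_marginal_via_identity(x, theta, prior_sd=2.0, noise_sd=1.0):
    """log p(x) = log p(x | theta) + log p(theta) - log p(theta | x)."""
    # Exact posterior for theta ~ N(0, prior_sd^2), x | theta ~ N(theta, noise_sd^2).
    post_var = 1.0 / (1.0 / prior_sd**2 + 1.0 / noise_sd**2)
    post_mean = post_var * x / noise_sd**2
    log_lik = normal_logpdf(x, theta, noise_sd)
    log_prior = normal_logpdf(theta, 0.0, prior_sd)
    log_post = normal_logpdf(theta, post_mean, math.sqrt(post_var))
    return log_lik + log_prior - log_post

def log_marginal_closed_form(x, prior_sd=2.0, noise_sd=1.0):
    """Marginally, x ~ N(0, prior_sd^2 + noise_sd^2)."""
    return normal_logpdf(x, 0.0, math.sqrt(prior_sd**2 + noise_sd**2))

# The identity holds for every theta, so the choice of theta is arbitrary.
for theta in (-1.0, 0.3, 2.5):
    print(theta, log_marginal_via_identity(0.7, theta), log_marginal_closed_form(0.7))
```

With approximate densities the identity no longer holds exactly, so the agreement of the estimate across different values of θ gives a rough indication of how consistent the two approximators are with each other.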
Learning the Likelihood: Using DeepInference for the Estimation of Diffusion-Model and Lévy Flight Parameters [Dataset]
In the corresponding paper, we use the recently developed DeepInference architecture as a general likelihood-free method to estimate parameters of cognitive models. DeepInference is a machine-learning algorithm based on training convolutional neural networks. In a first step, the network is trained on simulated data to learn the relation between parameters and data. The trained network can then be used to estimate parameters for real data. The efficiency and robustness of this approach were tested for two decision models based on continuous evidence accumulation. Study 1 investigated the recovery of parameters of the diffusion model, and Study 2 addressed the same question for a Lévy flight model. Results demonstrate that the machine-learning approach is superior to traditional multidimensional search algorithms that maximize the likelihood, both in terms of correlations of estimated parameters with true parameters and with regard to absolute deviations. The new approach also outperforms the maximum-likelihood-based search in robustness to contaminated data.
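
The two-step recipe (train on simulated data, then apply to new data) can be sketched on a deliberately simple stand-in problem. Everything in this example is an assumption made for illustration: a Gaussian simulator with an unknown mean replaces the diffusion and Lévy flight models, a linear least-squares fit on hand-picked summaries replaces the convolutional network, and the "observed" data are themselves simulated. It is not the DeepInference architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(mu, n_obs=50):
    """Toy simulator: n_obs draws from a normal distribution with mean mu."""
    return rng.normal(loc=mu, scale=1.0, size=n_obs)

def summarize(data):
    """Hand-picked features: intercept, sample mean, sample standard deviation."""
    return np.array([1.0, data.mean(), data.std()])

# Step 1: draw parameters from a prior range, simulate data, and fit a
# mapping from data summaries back to the parameters that generated them.
true_mus = rng.uniform(-2.0, 2.0, size=2000)
features = np.stack([summarize(simulate(mu)) for mu in true_mus])
weights, *_ = np.linalg.lstsq(features, true_mus, rcond=None)

# Step 2: apply the fitted mapping to a new data set (simulated here with mu = 0.8).
observed = simulate(0.8)
estimate = summarize(observed) @ weights

# Parameter recovery on the training simulations, as a correlation.
recovery = np.corrcoef(features @ weights, true_mus)[0, 1]
print(estimate, recovery)
```

The correlation between estimated and true parameters computed at the end is the same kind of recovery measure the description refers to, although a proper check would use simulations held out from training.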